Identifying Unknown Proper Names In Newswire Text

Authors

Inderjeet Mani

Richard MacMillan

Susann LuperFoy

Elaine Lusher

Sharon Laskowski

Abstract

The identification of unknown proper names in text is a significant challenge for NLP systems operating on unrestricted text. A system which indexes documents according to name references can be useful for information retrieval or as a preprocessor for more knowledge intensive tasks such as database extraction. This paper describes a system which uses text skimming techniques for deriving proper names and their semantic attributes automatically from newswire text, without relying on any listing of name elements. In order to identify new names, the system treats proper names as (potentially) context-dependent linguistic expressions. In addition to using information in the local context, the system exploits a computational model of discourse which identifies individuals based on the way they are described in the text, instead of relying on their description in a pre-existing knowledge base.
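The two ideas in the abstract, local context cues and a discourse model that links later mentions to individuals introduced earlier in the text, can be illustrated with a minimal sketch. This is not the authors' system: the cue sets `HONORIFICS` and `ORG_SUFFIXES`, the function-word list `STOPWORDS`, and the functions `candidate_mentions`, `classify`, and `resolve` are all hypothetical names chosen for illustration, and the token-subset matching rule is a deliberately crude stand-in for a real discourse model. Note that no list of names is used, only cues that surround names.

```python
# Illustrative sketch only; all cue lists and function names are assumptions.
HONORIFICS = {"Mr.", "Ms.", "Mrs.", "Dr."}
ORG_SUFFIXES = {"Corp.", "Inc.", "Co.", "Ltd."}
# Capitalized function words that often start a sentence (not a name list).
STOPWORDS = {"The", "A", "An", "In", "On", "He", "She", "It", "But"}


def candidate_mentions(text):
    """Group consecutive capitalized tokens into candidate name mentions."""
    mentions, current = [], []
    for raw in text.split():
        token = raw.rstrip(",;:!?")
        ends_run = token != raw  # trailing punctuation closes the run
        if token.endswith(".") and token not in HONORIFICS | ORG_SUFFIXES:
            token = token.rstrip(".")
            ends_run = True
        is_name_like = token[:1].isupper() and token not in STOPWORDS
        if is_name_like:
            current.append(token)
        if current and (ends_run or not is_name_like):
            mentions.append(current)
            current = []
    if current:
        mentions.append(current)
    return mentions


def classify(mention):
    """Assign a coarse semantic type from local context cues only."""
    if mention[0] in HONORIFICS and len(mention) > 1:
        return " ".join(mention[1:]), "person"
    if mention[-1] in ORG_SUFFIXES and len(mention) > 1:
        return " ".join(mention), "organization"
    return " ".join(mention), "unknown"


def resolve(text):
    """Link each mention to an entity introduced earlier in the same text."""
    entities = {}  # canonical name -> semantic type
    links = []     # (surface mention, canonical name, type)
    for mention in candidate_mentions(text):
        name, kind = classify(mention)
        # A shorter mention is linked to the first earlier entity whose
        # name contains all of its tokens (e.g. "Smith" -> "John Smith").
        antecedent = next(
            (known for known in entities
             if set(name.split()) <= set(known.split())),
            None,
        )
        if antecedent is None:
            entities[name] = kind
            antecedent = name
        elif entities[antecedent] == "unknown":
            entities[antecedent] = kind  # later context supplies the type
        links.append((" ".join(mention), antecedent, entities[antecedent]))
    return entities, links


if __name__ == "__main__":
    sample = ("Mr. John Smith joined Acme Corp. last year. "
              "Smith said Acme would expand.")
    for link in resolve(sample)[1]:
        print(link)
```

In this sketch the bare mention "Smith" inherits the type "person" from the earlier "Mr. John Smith" without being looked up in any external knowledge base, which is the kind of behavior the abstract attributes to a discourse-based approach.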

Similar resources

Automatic Semantic Tagging of Unknown Proper Names

Implemented methods for proper names recognition rely on large gazetteers of common proper nouns and a set of heuristic rules (e.g. Mr. as an indicator of a PERSON entity type). Though the performance of current PN recognizers is very high (over 90%), it is important to note that this problem is by no means a "solved problem". Existing systems perform extremely well on newswire corpora by virtu...

Using Mutual Information to Identify New Features for Text documents of Various Domains

The task of identifying proper names, unknown words and new terms, is an important step in text processing systems. This paper describes a method of using mutual information to collect possible segments as candidates of these three feature types in a document scope. Then the construction and context of each possible feature is examined to determine its type, canonical form and meaning. Adding v...

Proper Name Extraction from Non-Journalistic Texts

This paper discusses the influence of the corpus on the automatic identification of proper names in texts. Techniques developed for the newswire genre are generally not sufficient to deal with larger corpora containing texts that do not follow strict writing constraints (for example, e-mail messages, transcriptions of oral conversations, etc). After a brief review of the research performed on n...

Extracting Names From Arabic Text for Question-Answering Systems

Tagging and extracting proper names is an important key for improving the effectiveness of question-answering systems. The valuable information in a text is usually located around proper names; to collect this information, the names must be found first. By extracting proper names from the text we provide question-answering systems with both the proper name found in the text, some information about it...

Automatic Processing of Proper Names in Texts

This paper first shows the problems raised by proper names in natural language processing. Second, it introduces the knowledge representation structure we use, based on conceptual graphs. Then it explains the techniques used to process known and unknown proper names. Finally, it reports the performance of the system and the further work we intend to pursue. ...


Publication date: 1993
